25 research outputs found

    Self-Aware Thermal Management for High-Performance Computing Processors

    Editor's note: Thermal management in high-performance multicore platforms has become exceedingly complex due to variable workloads, thermal heterogeneity, and long thermal transients. This article addresses these complexities through sophisticated analysis of noisy thermal sensor readings, dynamic learning that adapts to the peculiarities of the hardware and the applications, and a dynamic optimization strategy. - Axel Jantsch, TU Wien - Nikil Dutt, University of California at Irvine
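    As a concrete illustration of the first ingredient, the sketch below shows a scalar Kalman-style filter applied to a noisy temperature trace. It is a minimal example written for this listing; the random-walk state model, the noise variances, and the sample values are assumptions, not the article's actual method.

    def kalman_smooth(readings, q=0.01, r=4.0):
        """Smooth noisy temperature samples under a random-walk state model."""
        est, var = readings[0], r          # initialise from the first sample
        smoothed = [est]
        for z in readings[1:]:
            var += q                       # predict: temperature drifts slowly
            k = var / (var + r)            # Kalman gain
            est += k * (z - est)           # correct with the new noisy sample
            var *= (1.0 - k)
            smoothed.append(est)
        return smoothed

    if __name__ == "__main__":
        noisy = [60.2, 63.1, 58.7, 61.4, 65.0, 62.3]   # deg C, hypothetical sensor trace
        print([round(t, 1) for t in kalman_smooth(noisy)])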

    From the Amelioration of a NADP+-dependent Formate Dehydrogenase to the Discovery of a New Enzyme: Round Trip from Theory to Practice

    NADP+-dependent formate dehydrogenases (FDHs) are biotechnologically relevant enzymes for cofactor regeneration in industrial processes employing redox biocatalysts. Their effective applicability is, however, hampered by the low cofactor and substrate affinities of the few enzymes described so far. After various efforts to improve the previously studied GraFDH from the acidobacterium Granulicella mallensis MP5ACTX8, an enzyme with dual (NAD+ and NADP+) cofactor specificity, we restarted our search with the advantage of hindsight. We identified and characterized GraFDH2, a novel, highly active FDH that proved to be a good NAD+-dependent catalyst. A rational engineering approach allowed us to switch its cofactor specificity, producing an enzyme variant that displays a 10-fold activity improvement over the wild-type enzyme with NADP+. This variant turned out to be one of the best-performing NADP+-dependent FDHs reported so far in terms of catalytic performance.

    Experimenting with Emerging ARM and RISC-V Systems for Decentralised Machine Learning

    Decentralised Machine Learning (DML) enables collaborative machine learning without centralised input data. Federated Learning (FL) and Edge Inference are examples of DML. While tools for DML (especially FL) are starting to flourish, many are not flexible and portable enough to experiment with novel systems (e.g., RISC-V), non-fully connected topologies, and asynchronous collaboration schemes. We overcome these limitations via a domain-specific language that allows mapping DML schemes onto an underlying middleware, i.e. the \ff parallel programming library. We experiment with it by generating different working DML schemes on two emerging architectures (ARM-v8, RISC-V) and the x86-64 platform. We characterise the performance and energy efficiency of the presented schemes and systems. As a byproduct, we introduce a RISC-V port of the PyTorch framework, the first publicly available to our knowledge.
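    To make the collaboration scheme concrete, the sketch below shows one synchronous federated-averaging (FedAvg) round in PyTorch. It is a minimal illustration only: the toy model, the four simulated workers, and the hyperparameters are assumptions, and it does not use the paper's domain-specific language or the \ff middleware.

    import copy
    import torch
    import torch.nn as nn

    def local_update(model, data, target, lr=0.01, epochs=1):
        """Train a private copy of the global model on one worker's local data."""
        local = copy.deepcopy(model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss_fn(local(data), target).backward()
            opt.step()
        return local.state_dict()

    def fed_avg(global_model, worker_states):
        """Average the workers' parameters into the global model (FedAvg)."""
        avg = copy.deepcopy(worker_states[0])
        for key in avg:
            for state in worker_states[1:]:
                avg[key] += state[key]
            avg[key] /= len(worker_states)
        global_model.load_state_dict(avg)
        return global_model

    if __name__ == "__main__":
        torch.manual_seed(0)
        global_model = nn.Linear(8, 1)                      # toy model
        workers = [(torch.randn(32, 8), torch.randn(32, 1)) for _ in range(4)]
        for _ in range(3):                                  # three synchronous rounds
            states = [local_update(global_model, x, y) for x, y in workers]
            global_model = fed_avg(global_model, states)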

    On-line thermal emulation: How to speed-up your thermal controller design

    Dynamic thermal management (DTM) is a key technology for future many-core systems. Indeed, such systems, both server-class and embedded chip multiprocessors, are thermally constrained. DTM design requires consideration of the chain of interactions between HW operating points, workload phases, power consumption, die temperature, the HW monitoring infrastructure, and the control policy. Hugely different time scales are involved, from microseconds to hours, and simulating the performance of DTM solutions for a many-core system in a reasonable time is an open problem. In this paper we present an on-line thermal emulation framework based on the Intel Single-Chip Cloud computer. In our framework, a subset of the cores is used to emulate on-line the evolution of a generic thermal floorplan, based on the real workload and operating points selected by the remaining cores, which emulate the target managed system. This enables design space exploration of dynamic thermal management solutions at the speed of real workload execution.
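    The sketch below gives a flavour of the kind of model such an emulation core could integrate at run time: an explicit-Euler step of a compact RC thermal network driven by per-core power. The four-core floorplan, conductances, capacitances, and power values are illustrative assumptions, not parameters of the SCC-based framework.

    import numpy as np

    def thermal_step(dT, P, G, C, dt=1e-3):
        """One explicit-Euler step of C * d(dT)/dt = P - G @ dT, with dT the rise above ambient."""
        return dT + dt * (P - G @ dT) / C

    if __name__ == "__main__":
        n = 4                                    # toy four-core floorplan in a row
        g_amb, g_lat = 0.5, 1.0                  # W/K to ambient and between neighbours (assumed)
        G = np.zeros((n, n))
        for i in range(n):
            G[i, i] = g_amb
            if i > 0:
                G[i, i] += g_lat
                G[i, i - 1] -= g_lat
            if i < n - 1:
                G[i, i] += g_lat
                G[i, i + 1] -= g_lat
        C = np.full(n, 0.05)                     # J/K per node (assumed)
        P = np.array([12.0, 3.0, 3.0, 1.0])      # W, heterogeneous workload (assumed)
        dT = np.zeros(n)
        for _ in range(2000):                    # 2 s of emulated time at dt = 1 ms
            dT = thermal_step(dT, P, G, C)
        print("temperature rise above ambient (deg C):", np.round(dT, 2))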

    A scalable framework for online power modelling of high-performance computing nodes in production

    Power and thermal design and management are critical components of high-performance computing (HPC) systems, due to their cutting-edge position in terms of high power density and large total power consumption. Many HPC power management strategies rely on the availability of accurate compact power models, capable of predicting power consumption and tracking its sensitivity to workload parameters and operating points. In this paper we describe a methodology and a framework for training power models derived with two of the best-in-class procedures directly on in-production nodes, online and without requiring dedicated training instances. The compact power models are obtained using an online regression-based approach which can track non-stationary workloads and hardware variability. Our experiments on a real-life HPC system demonstrate that the models achieve very high accuracy over all operating modes. We also demonstrate the scalability of our approach and the small amount of resources needed for the online modelling, in both the training and inference phases.
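    As an illustration of the online regression idea, the sketch below implements a recursive-least-squares (RLS) power model updated one telemetry sample at a time; the forgetting factor lets it track non-stationary workloads. The feature set (bias, core frequency, IPC) and the synthetic node behaviour are assumptions, not the paper's actual model or data.

    import numpy as np

    class OnlinePowerModel:
        def __init__(self, n_features, lam=0.99, delta=100.0):
            self.w = np.zeros(n_features)         # model coefficients
            self.P = np.eye(n_features) * delta   # inverse correlation matrix
            self.lam = lam                        # forgetting factor < 1 tracks drift

        def predict(self, x):
            return float(self.w @ x)

        def update(self, x, power):
            """One RLS step on a (features, measured power) sample."""
            Px = self.P @ x
            k = Px / (self.lam + x @ Px)          # gain vector
            err = power - self.w @ x
            self.w = self.w + k * err
            self.P = (self.P - np.outer(k, Px)) / self.lam
            return err

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        model = OnlinePowerModel(n_features=3)
        true_w = np.array([20.0, 0.03, 5.0])      # hypothetical node behaviour
        for _ in range(500):                      # stream of telemetry samples
            x = np.array([1.0, rng.uniform(1000, 2500), rng.uniform(0.0, 4.0)])  # bias, MHz, IPC
            power = true_w @ x + rng.normal(0.0, 1.0)   # noisy measured power (W)
            model.update(x, power)
        print("learned coefficients:", np.round(model.w, 3))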

    Cooling-aware node-level task allocation for next-generation green HPC systems

    Energy efficiency is of primary interest in future HPC systems, as their computational growth is limited by the supercomputer's peak power consumption. A significant part of the power consumed by a supercomputer is drawn by the cooling infrastructure. Today's thermal design is based on coarse-grain models which treat the silicon die of the processing elements as an isothermal surface. Similarly, feedback control loops use the same assumption to modulate the cooling effort, with the goal of reducing cooling cost while maintaining the silicon temperature in a safe working range. Recent processor developments have brought to market CPUs that integrate a large number of complex cores. Differently from massively parallel CPUs, for which the area and power consumption of each core is very limited, the cores of these processors can consume tens of watts and thus, under heterogeneous workloads, create significant thermal gradients. In this paper we first characterize the power and thermal behaviour of a new server-class Intel Xeon computing node based on the Haswell v3 architecture, considering both the computational and the cooling components. We show that these systems exhibit significant on-die thermal gradients and that the current OS task allocation strategy is not capable of taking advantage of them, leading to higher maximum CPU temperature and extra cooling activity. To solve this issue we propose a novel task allocation strategy that reduces the cooling power while meeting the HPC performance requirements.
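    The sketch below captures the intuition behind such an allocation: greedily pair the most power-hungry tasks with the coolest cores to flatten on-die thermal gradients. The per-task power estimates and per-core temperatures are illustrative assumptions, not measurements from the Haswell v3 node, and the paper's actual strategy may differ.

    def cooling_aware_allocation(task_power, core_temp):
        """Return a dict {task_id: core_id} mapping the hottest tasks to the coolest cores."""
        hot_first = sorted(range(len(task_power)), key=lambda t: task_power[t], reverse=True)
        cool_first = sorted(range(len(core_temp)), key=lambda c: core_temp[c])
        return {t: c for t, c in zip(hot_first, cool_first)}

    if __name__ == "__main__":
        task_power = [35.0, 8.0, 22.0, 5.0]       # W, assumed per-task power estimates
        core_temp = [62.0, 55.0, 71.0, 58.0]      # deg C, assumed current core temperatures
        print(cooling_aware_allocation(task_power, core_temp))
        # {0: 1, 2: 3, 1: 0, 3: 2}: the 35 W task lands on the 55 deg C core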